step decay
On the Convergence of Step Decay Step-Size for Stochastic Optimization
Step decay step-size schedules (constant and then cut) are widely used in practice because of their excellent convergence and generalization qualities, but their theoretical properties are not yet well understood. We provide convergence results for step decay in the non-convex regime, ensuring that the gradient norm vanishes at an O(ln T/√T) rate.
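To make the "constant and then cut" idea concrete, here is a minimal sketch of one common form of step decay schedule. The function name `step_decay` and the parameters `eta0` (initial step-size), `alpha` (decay factor), and `stage_len` (iterations per stage) are illustrative assumptions, not notation or values taken from the paper.

```python
def step_decay(t, eta0=0.1, alpha=2.0, stage_len=100):
    """Return the step-size at iteration t (0-indexed).

    Assumed form: the step-size stays constant within a stage of
    `stage_len` iterations and is divided by `alpha` at each stage boundary.
    """
    stage = t // stage_len           # index of the stage containing iteration t
    return eta0 / (alpha ** stage)   # cut by a factor of alpha once per stage


# Step-sizes over three stages: constant within a stage, cut between stages.
schedule = [step_decay(t) for t in range(300)]
```

With these assumed defaults the step-size is 0.1 for iterations 0 to 99, 0.05 for 100 to 199, and 0.025 for 200 to 299. In an SGD loop the value would be looked up once per iteration before the parameter update.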
Country:
- Europe > Sweden > Stockholm > Stockholm (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Middle East > Jordan (0.04)
Country:
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- North America > United States > New York (0.04)
Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.97)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)